-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
#992: added design doc for script options #477
base: master
Are you sure you want to change the base?
Conversation
3522853
to
0a2684d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How did you create these? If it is drawio, please use the extension drawio.png otherwise it is not clear how to change them and the tooling won't recognize them.
|
||
#### Legacy Parser | ||
|
||
The legacy parser (V1) parser searches for one specific script option. The parser starts from beginning of the script code. If found, the parser immediately removes the script option from the script code and returns the option value. It validates the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In a narrow interpretation of arc42, the removal would be part of the runtime view and not the building blocks.
|
||
Tags: V2 | ||
|
||
### General Parser Integration |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same for the DB
b6dca79
to
207e27f
Compare
@@ -0,0 +1,27 @@ | |||
#!/bin/bash |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should at some Point replace this with a nox session, please create an issue
## Constraints | ||
|
||
- The parser implementation must be in C++. | ||
- The chosen parser implementation is [ctpg](https://github.com/peter-winter/ctpg), which supports the definition of Lexer and Parser Rules in C++ code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a constraint, but a decision that follows from the constraints
|
||
![Components](diagrams/OveralScriptOptionalsBuildingBlocks.drawio.png) | ||
|
||
At the very high level there can be distinguished between the generic "Script Options Parser" module which parses a UDF script code and returns the found script options, and the "Script Options Parser Handler" which converts the Java UDF specific script options. In both modules there are specific implementation for the legacy parser and the new CTPG based parser. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add reason, why we need to implementation
|
||
### Script Options Parser | ||
|
||
The parser component can be used to parse any script code (Java, Python, R) for any script options. It provides simplistic interfaces, which are different between the two versions, which accept the script code as input and return the found script option(s). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add reason, why the interfaces are different. Here, they work inherently different
|
||
As the parser needs to find script options in any given script code, the generated parser must accept any strings which are not script options and ignore those. In order to achieve this, the lexer rules need to be as simple as possible, in order to avoid collisions. | ||
|
||
It is important to emphasize that in contrast to the legacy parser, the caller is responsible for removing the script options from the script code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have a good reason?
|
||
![LegacyParserHandler](diagrams/LegacyParserHandler.drawio.png) | ||
|
||
The `ScriptOptionsLinesParserLegacy` class uses the Parser to search for Java specific script options and forwards the found options to class `ConverterLegacy`, which uses a common implementation for the conversion of the options. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The `ScriptOptionsLinesParserLegacy` class uses the Parser to search for Java specific script options and forwards the found options to class `ConverterLegacy`, which uses a common implementation for the conversion of the options. | |
The `ScriptOptionsLinesParserLegacy` class uses the Parser to search for Java specific script options and forwards the found options to the class `ConverterLegacy`, which uses a common implementation for the conversion of the options. |
common implementation is misleading, because it suggests both handler use it
|
||
![CTPGParserHandler](diagrams/CTPGParserHandler.drawio.png) | ||
|
||
The `ScriptOptionsLinesParserCTPG` class uses the new CTPG basedParser to search for **all** Java specific script options at once. Then it forwards the found options to class `ConverterV2`, which uses a common implementation for the conversion of the options. `ConverterV2` also implements the functions to convert Jvm otions and JAR options. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The `ScriptOptionsLinesParserCTPG` class uses the new CTPG basedParser to search for **all** Java specific script options at once. Then it forwards the found options to class `ConverterV2`, which uses a common implementation for the conversion of the options. `ConverterV2` also implements the functions to convert Jvm otions and JAR options. | |
The `ScriptOptionsLinesParserCTPG` class uses the new CTPG based Parser to search for **all** Java specific script options at once. Then it forwards the found options to class `ConverterV2`, which uses a common implementation for the conversion of the options. `ConverterV2` also implements the functions to convert Jvm otions and JAR options. |
common again
`ScriptOptionsLinesParserCTPG` uses an instance of `ScriptImporter` to import foreign scripts. Because the new parser collects all script options at once, but backwards compatibility with existing UDF scripts must be ensured, there is an additional level of complexity in the import script algorithm. The algorithm is described in the following pseudocode snippet: | ||
``` | ||
function import(script_code, options_map) | ||
import_option = options_map.find("import") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
import_option = options_map.find("import") | |
import_options = options_map.find("import") |
Should this be plural?
static void doSomething() {} | ||
} | ||
``` | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add the expected results
|
||
### Parser Implementation V1 | ||
|
||
The legacy parser (V1) parser searches for one specific script option. The parser starts from beginning of the script code. If found, the parser immediately removes the script option from the script code and returns the option value. It validates the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The legacy parser (V1) parser searches for one specific script option. The parser starts from beginning of the script code. If found, the parser immediately removes the script option from the script code and returns the option value. It validates the | |
The legacy parser (V1) parser searches for one specific script option. The parser starts from the beginning of the script code. If found, the parser immediately removes the script option from the script code and returns the option value. It validates the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The rest of the sentence is missing
### Lexer and Parser Rules Option | ||
`dsn~lexer-parser-rules~1` | ||
|
||
Lexer and Parser rules to recognize `%optionKey`, `optionValue`, with whitespace characters as separator. The Parser rules will define the grammar to correctly identify Script Options, manage multiple options with the same key, and handle duplicates. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mention simple lexer rules to deal with whitespace and escaping correct
### Ignore lines without script options | ||
`dsn~ignore-lines-without-script-options~1` | ||
|
||
In order to avoid lower performance compared to the old implementation, the parser must run only on lines which contain a `%` character. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In order to avoid lower performance compared to the old implementation, the parser must run only on lines which contain a `%` character. | |
In order to avoid lower performance compared to the old implementation, the parser must run only on lines which contain a `%` character after only whitespaces. |
|
||
Tags: V2 | ||
|
||
### Lexer and Parser Rules Whitespace Sequences |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to continue here
- '\f' => <form feed> character | ||
- '\v' => <vertical tab> character | ||
|
||
Implement rules which replace those white space escape tokens only at the beginning of an option value. Add parser rules to ignore those tokens in anything else, which is not a script option. Those token are not expected to be part of an option key. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implement rules which replace those white space escape tokens only at the beginning of an option value. Add parser rules to ignore those tokens in anything else, which is not a script option. Those token are not expected to be part of an option key. | |
Implement rules which replace those white space escape tokens only at the beginning of an option value. Add parser rules to leave those tokens as is in anything else, which is not a script option. Those token are not expected to be part of an option key. |
### Lexer and Parser Rules Escape Sequences | ||
`dsn~lexer-parser-rules-escape-sequences~1` | ||
|
||
Define the Lexer rules to tokenize '\n', '\r', '\; sequences, and parser rules to replace those sequences with <line feed>, <carriage return> or ';' characters. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Define the Lexer rules to tokenize '\n', '\r', '\; sequences, and parser rules to replace those sequences with <line feed>, <carriage return> or ';' characters. | |
Define the Lexer rules to tokenize '\n', '\r', '\;' sequences, and parser rules to replace those sequences with <line feed>, <carriage return> or ';' characters. |
`dsn~lexer-parser-rules-escape-sequences~1` | ||
|
||
Define the Lexer rules to tokenize '\n', '\r', '\; sequences, and parser rules to replace those sequences with <line feed>, <carriage return> or ';' characters. | ||
Implement rules which replace those token at any location in an option value. Add parser rules to ignore those tokens in anything else, which is not a script option. Those token are not expected to be part of an option key. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implement rules which replace those token at any location in an option value. Add parser rules to ignore those tokens in anything else, which is not a script option. Those token are not expected to be part of an option key. | |
Implement rules which replace those token at any location in an option value. Add parser rules to leave those tokens as is in anything else, which is not a script option. Those token are not expected to be part of an option key. |
### Script Option Removal Mechanism | ||
`dsn~script-option-removal~1` | ||
|
||
Implement a method in class `ScriptOptionLinesParserCTPG` which removes *all* identified Script Options from the original script code at once. This method be executed after the import scripts are replaced. The algorithm must replace the script options in reverse order in order to maintain consistency of internal list of positions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implement a method in class `ScriptOptionLinesParserCTPG` which removes *all* identified Script Options from the original script code at once. This method be executed after the import scripts are replaced. The algorithm must replace the script options in reverse order in order to maintain consistency of internal list of positions. | |
Implement a method in class `ScriptOptionLinesParserCTPG` which removes *all* identified Script Options from the original script code at once. This method should be executed after the import scripts are replaced. The algorithm must replace the script options in reverse order in order to maintain consistency of internal list of positions. |
### Unknown Options Check | ||
`dsn~script-option-unknown-options-behvaior-v2~1` | ||
|
||
Implement a check in class `ScriptOptionLinesParserCTPG` which verifies that only known script options are found. Otherwise, raise an exception with information about the position and name of the unknown option. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, after a second thought this gets nasty, because the supported options depend on the handlers, but that a handler doesn't support certain options doesn't mean they are illegal, it is still possible that a subsequent handler can handle them. But, the handlers are independent of each other and don't know which options they support. This means, we need centralize all supported options. This means also whenever we are adding a new option, we have to add it to the handler and the parser.
Example is the DB parser and Java parser, the DB parser has no idea what the Java parser supports.
|
||
Tags: V2 | ||
|
||
### Java %scriptclass Option Handling in Design |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in Design?
### Java %scriptclass Option Handling in Design | ||
`dsn~java-scriptclass-option-handling~1` | ||
|
||
Implement a function in the `Converter` class which adds the script class option to the JVM Options list. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implement a function in the `Converter` class which adds the script class option to the JVM Options list. | |
Implement a function in the `ConverterV2` class which adds the script class option to the JVM Options list. |
?
### Java %jar Option Collection | ||
`dsn~java-jar-option-collection~1` | ||
|
||
Implement an algorithm in class `ConverterV2` to split the given Jar option value by colon (':') and then collect the found Jar options in a list. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add our links about what are legal classpaths
related to exasol/script-languages-release#992